ABSTRACT

The author contends that model misspecification can
occur even after researchers have selected the generally most
appropriate class of methods, or general linear model techniques. It
is suggested specifically that canonical correlation analysis may
provide more meaningful results, as compared with regression,
particularly if analysis is augmented by the computation of structure
coefficients. It is also suggested that contemporary analytic
practice reflects some improvements over more traditional practice.
Researchers are increasingly investigating multivariate problems with
multivariate methods. Greater use of the multivariate general linear
model, or canonical correlation analysis, augmented by the
calculation of appropriate coefficients, including structure
coefficients, is proposed for future research. (DWH)
ABSTRACT

The purpose of this paper is to argue that model
misspecification can occur even once researchers have selected
the generally most appropriate class of methods, i.e., general
linear model techniques. More specifically, it is suggested
that canonical correlation analysis may provide more
meaningful results than other general linear model techniques,
particularly if analysis is augmented by the computation of
structure coefficients. Several trends in recent
methodological practice are discussed.
For the past several decades social scientists have
periodically reviewed typical analytic practice with a view
toward improving methodology. For example, Cohen (1968)
suggested that some researchers use analysis of variance
techniques when general linear model techniques would be more
appropriate; Thompson (1981) comments on a possible etiology
for and some consequences of this situation. Clark (1973)
suggested that research might be more profitable if more
researchers employed "random" and "mixed" effects models;
Willson (1982) suggests that this form of "model
misspecification" continues today. Marascuilo and Levin
(1976) have cautioned against the dangers of Type IV errors,
i.e., the incorrect interpretation of a correctly rejected
hypothesis; they suggested that these errors may be
particularly likely when interaction effects and post hoc
tests are interpreted. Thus, the literature suggests that
model misspecification, in its general sense, occurs at
various levels, including the selection of class of analytic
technique, the selection of error terms with which to test
omnibus effects, and the testing of post hoc comparisons.

The purpose of this article is to demonstrate that model
misspecification can occur even once researchers have selected
the generally most appropriate class of methods, i.e., general
linear model techniques. More specifically, it is suggested
that canonical correlation analysis may provide more
meaningful results, as compared with regression, even when the
research context is held constant. Thompson (1982b) provides
a review of canonical methods; a computer program which
implements some recent extensions of canonical methods is also
available (Thompson, 1982a).

Heuristic Example 
Table 1 presents hypothetical data which can be used to 
make the discussion more concrete. The hypothetical case
involves three predictor variables: pupil self-concept, 
income of the pupils' families, and the per-pupil expenditure 
of the pupils' schools. The researcher has two options with 
respect to selection of criterion variables. Composite 
achievement scores are available, or the researcher can 
consider both the reading and the math achievement subtest
scores.

INSERT TABLE 1 ABOUT HERE.

Even though the hypothetical study did not involve any 
experimental manipulation, some researchers confuse 
design-choice consequences with analytic-choice consequences, 
and might dichotomize or trichotomize the three predictor 
variables and perform ANOVA or MANOVA analyses. Presume, 
however, that the researcher did not elect to distort the 
reality that the data are supposed to represent; this can
occur when normally-distributed, intervally-scaled variables
are converted to uniformly-distributed, nominally-scaled
variables simply in order to perform OVA techniques. Happily,
the hypothetical researcher has selected a general linear 
model framework for the analysis. 

Three analytic options then become available. First, the 
researcher might perform a multiple regression analysis, 
employing composite test scores as the sole criterion 
variable. Second, the researcher might perform two multiple
regression analyses employing the reading and math subtest
scores as separate criterion variables in the two analyses.
Or, finally, the researcher might perform a canonical
correlation analysis which simultaneously considers both the
two subtest criterion variables and the three predictor
variables. The results associated with these three options
are all presented in Table 2. 
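
The third option can be sketched numerically. The Python fragment below is only a minimal illustration and is not the program cited above (Thompson, 1982a). It assumes NumPy, a made-up correlation matrix (the values are hypothetical and are not those of Table 1), and helper names chosen here for illustration. The squared canonical correlations are obtained as the eigenvalues of the matrix product Ryy^-1 Ryx Rxx^-1 Rxy.

```python
import numpy as np

# Hypothetical correlation matrix (illustrative values only, NOT Table 1).
# Variable order: reading (Y), math (Z), self-concept (A),
# family income (B), pupil expenditure (C).
R = np.array([
    [1.0, 0.6, 0.4, 0.3, 0.3],
    [0.6, 1.0, 0.2, 0.4, 0.1],
    [0.4, 0.2, 1.0, 0.3, 0.2],
    [0.3, 0.4, 0.3, 1.0, 0.4],
    [0.3, 0.1, 0.2, 0.4, 1.0],
])


def canonical_correlations(R, n_criteria):
    """Return canonical correlations (largest first) from a correlation matrix
    whose first n_criteria rows/columns are the criterion variables."""
    q = n_criteria
    Ryy, Ryx, Rxx = R[:q, :q], R[:q, q:], R[q:, q:]
    # Eigenvalues of Ryy^-1 Ryx Rxx^-1 Rxy are the squared canonical correlations.
    M = np.linalg.solve(Ryy, Ryx) @ np.linalg.solve(Rxx, Ryx.T)
    squared = np.sort(np.linalg.eigvals(M).real)[::-1]
    return np.sqrt(np.clip(squared, 0.0, 1.0))


def multiple_r(R, criterion, predictors):
    """Multiple R for one criterion regressed on the listed predictors."""
    rxy = R[predictors, criterion]
    Rxx = R[np.ix_(predictors, predictors)]
    return float(np.sqrt(rxy @ np.linalg.solve(Rxx, rxy)))


rc = canonical_correlations(R, n_criteria=2)
print("Rc:", rc.round(3))
print("R (reading):", round(multiple_r(R, 0, [2, 3, 4]), 3))
print("R (math):", round(multiple_r(R, 1, [2, 3, 4]), 3))
```

Because the first canonical function maximizes the relationship between the two weighted composites, its Rc cannot be smaller than the multiple R obtained for either criterion variable analyzed alone.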

INSERT TABLE 2 ABOUT HERE.



The Table 2 results make clear that analytic choices can
have noteworthy impacts on interpretation, even when the
choices all fall within the same analytic framework, and even
when the various criterion variables are substantially 
correlated with each other. For example, the equation 
"weights" and structure coefficients for the pupil expenditure 
variable tend to differ across the solutions. The estimates 
of the predictive effectiveness of the equations also tend to 
fluctuate somewhat across solutions. This raises questions
regarding the appropriate analytic choices in such situations. 
The answers to these questions may have implications for 
decisions in other situations as well. 



As a general rule, researchers should employ more rather
than fewer criterion variables in their studies. In 
education, most variables have both multiple causes and 
multiple effects. Researchers should employ analytic 
techniques which honor the complex nature of the reality to 
which the researcher is attempting to generalize. As 
Kerlinger (1973, p. 149) argues, "to account for the complex 
psychological and sociological phenomena of education requires 
design and analytic tools that are capable of handling the 
complexity, which manifests itself above all in multiplicity
of independent and dependent variables."

Thus, in cases like the hypothetical situation presented 
here, the use of the two subtest achievement scores would have
been preferable to the use of the single composite score
variable. The only empirical case for the use of composite
rather than subtest scores is that composite scores tend to be
more reliable than their component subtest scores. On the
basis of superficial thought, some researchers seem to believe
that "longer" tests are always more reliable than "shorter"
tests, as a function of some mysterious effects of test length
per se. Actually, test length affects reliability only
insofar as length may affect variability, which is usually the
most direct determinant of reliability (see Gronlund, 1976,
p. 119, for a readable explanation). In any case, it is also
important to remember that improvements in reliability which 
are derived by increasing the number of test items can also 
paradoxically result in decreased test validity. 

Given that multiple criterion variables are generally of 
interest to researchers, it can be argued that canonical 
methods frequently provide important analytic benefits. For 
example, the calculation of separate correlational analyses 
for multiple criterion variables usually inflates the
probability of making Type I errors, depending on the degree
of correlation among the criterion variables. Furthermore,
such approaches distort reality to the extent that ignoring
relationships among the criterion variables can also distort
the substantive interpretation of results, as noted in the
heuristic example; this distortion is almost as unfortunate
as the Procrustean application of OVA techniques in 
non-experimental studies (Thompson, 1981). 
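
The size of this inflation can be illustrated with a simple calculation. The sketch below assumes, for illustration only, that the k separate tests are independent, in which case the experimentwise rate is 1 - (1 - alpha)^k. When the criterion variables are correlated the rate differs from this value, but it never exceeds the Bonferroni bound of k times alpha. The function names are hypothetical.

```python
def experimentwise_alpha(alpha, k):
    """Probability of at least one Type I error across k independent tests."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_bound(alpha, k):
    """Upper bound on the experimentwise rate, whatever the correlation."""
    return min(1.0, k * alpha)


# Compare the two quantities for one, two, and five separate analyses.
for k in (1, 2, 5):
    print(k, round(experimentwise_alpha(0.05, k), 4), bonferroni_bound(0.05, k))
```

With two separate regression analyses each tested at the .05 level, as in the second analytic option of the heuristic example, the independent-test figure is already close to .10.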

Incidentally, the Table 2 results also provide an 
opportunity to comment on the common but unfortunate failure
to calculate structure coefficients in correlational research.
For example, few researchers report structure coefficients
when multiple regression techniques are applied, even though

    one of the most useful ways to look at the
    regression function is in terms of its
    correlations with the predictor elements
    on which it is defined. . . . Our tendency
    to deemphasize the b[eta] weights stems
    from experience with the phenomenon of
    extreme fluctuation of regression weights
    from sample to sample when the sample size
    is small. Even when the sample size is
    moderate there is substantial fluctuation
    (Cooley & Lohnes, 1971, pp. 54-55).

Levine (1977, p. 20, his emphasis) is equally adamant about
the importance of structure coefficients in the canonical
case: "I specifically say that one has to do this [interpret
structure coefficients] since I firmly believe as long as one
wants information about the nature of the canonical
correlation relationship, not merely the computation of the
[canonical function] scores, one must have the structure
matrix."
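
In the regression case a structure coefficient is the correlation between a predictor and the predicted criterion scores, which equals the predictor's correlation with the criterion divided by the multiple R. The sketch below assumes NumPy and hypothetical correlations (they are not the Table 1 or Table 2 values); the function name is illustrative only.

```python
import numpy as np

# Hypothetical values for illustration only (not taken from Table 1).
Rxx = np.array([[1.0, 0.3, 0.2],   # correlations among the predictors A, B, C
                [0.3, 1.0, 0.4],
                [0.2, 0.4, 1.0]])
rxy = np.array([0.4, 0.3, 0.3])    # predictor-criterion correlations


def regression_summary(Rxx, rxy):
    """Return beta weights, multiple R, and structure coefficients."""
    beta = np.linalg.solve(Rxx, rxy)      # standardized regression weights
    mult_r = float(np.sqrt(beta @ rxy))   # R squared equals beta' rxy
    structure = rxy / mult_r              # r(predictor, predicted scores)
    return beta, mult_r, structure


beta, mult_r, structure = regression_summary(Rxx, rxy)
print("beta weights:          ", beta.round(2))
print("structure coefficients:", structure.round(2))
print("multiple R:            ", round(mult_r, 2))
```

With these made-up values the second and third predictors have identical structure coefficients but different beta weights, which is the kind of discrepancy that goes unnoticed when only the weights are reported.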

In summary, it has been suggested that contemporary 
analytic practice reflects some improvements over more
traditional practice. For example, researchers are
increasingly investigating multivariate problems with 
multivariate methods. There have also been some improvements 
with respect to the historically excessive use of OVA 
techniques. Hopefully, the future will bring more use of the 
multivariate general linear model, i.e., canonical correlation 
analysis, augmented by the calculation of appropriate 
coefficients, including structure coefficients. 
Table 1
Hypothetical Correlation Matrix

Variable

Composite Achievement (X)
Reading Achievement (Y)
Math Achievement (Z)
Self-Concept (A)
Family Income (B)
Pupil Expenditure (C)
Table 2
Associated Results

                                Criterion/Solutions             Canonical Results
                               X           Y           Z           I           II
Variable                    BW    SC    BW    SC    BW    SC    FC    SC    FC    SC

Reading Achievement (Y)                                       1.14   .98  -.51   .22
Math Achievement (Z)                                   [illegible]   .41  1.22 [illegible]
Self-Concept (A)           .18   .75   .42   .80   .15   .83   .85   .78  -.27   .32
Family Income (B)          .44   .94  -.03   .60   .10   .83  -.13   .56  1.18   .75
Pupil Expenditure (C)     -.07   .19   .31   .60   .06   .42   .67   .61  -.74  -.27
R or Rc                       .53         .50         .24         .51         .12

Note: "BW" = beta weights; "SC" = structure coefficients; "FC" =
function coefficients.



